ABSTRACT

Scholastic achievement tests and mental ability tests
normally consist of a set of multiple choice items, all of which are
assumed to measure school-relevant cognitive abilities. The
presumption, in a given test situation, is that the answers/solutions
to the given tasks represent cognitive capabilities on the part of
the examinees. The purpose of this paper is to show that this
assumption does not always hold. Analyzing simulated and empirical
data, it is shown that, based on the mixed Rasch model (Rost, 1990),
it is possible to identify those examinees who have applied a
guessing strategy to solve multiple choice items. As an empirical
example, the results of a biology test consisting of 23 items, each
having 5 choices, were analyzed. On the basis of the responses from
5,641 7th grade students, a guessing class was identified. Further
analyses provided information indicating that guessing behavior is
shown by students with lower-level cognitive abilities, who might
have used the "random strategy" to cope with the items that were too
difficult. Five tables and four figures provide study data. (Contains
21 references.) (Author/AA)
Identification of Guessing Behavior in Achievement Tests on the Basis of the MRM

1. Introduction 

When creating and applying new achievement tests with multiple choice items or
questionnaires with rating scales, educational researchers assume that these
instruments are suitable to measure the same trait for all subjects tested.
Researchers are pleased if ensuing item analyses reveal that a measurement model
with only one latent dimension holds for the data, since this facilitates interpretation
of individual test scores. Often, however, some examinees of a given sample do not show
the expected response behavior, for instance in an achievement test where some
subjects guess, or in a questionnaire where some subjects show special response sets
such as the tendency toward the mean or the tendency toward extreme judgements.
Rost (1994) labeled these examinees "unscalables" due to their deviant response
behavior, which is beyond the scope of item response models like the ordinary Rasch
model (Rasch, 1960). Rost (1994), Rost and Georg (1991) and Rost and Davier (1993)
suggested the mixed Rasch model (MRM) as a powerful tool for dealing with
"unscalables" in data sets. The purpose of our current work is to introduce this model
as a method for identifying guessing behavior in achievement tests.

Scholastic achievement tests and mental ability tests normally consist of a set of
multiple choice items, all of which are assumed to measure school-relevant cognitive
abilities. The presumption, in a given test situation, is that the answers/solutions to
the given tasks represent cognitive capabilities on the part of the examinees. Koeller,
Rost and Koeller (1994) demonstrated that this assumption does not always hold. In
their study, which dealt with individual differences in solving spatial tasks, the
authors administered cube tasks to 2558 7th grade students and, applying a latent
class analysis to the data, concluded that some examinees (16% of the whole sample)
employed a guessing strategy to solve the tasks.

Based upon these results, our current questions are: Do we find such undesired
guessing behavior in other achievement tests, and is the MRM a suitable statistical
method for identifying subjects who guessed? To answer these questions we will first
introduce the MRM and other procedures to model guessing behavior in Item
Response Theory. Next, we will analyze a simulated and an empirical data set on the
basis of the MRM with the PC program MIRA (Rost & Davier, 1992). In contrast to most
of the literature, which deals only with the psychometric or statistical issues of
guessing, we will place additional emphasis on the relationships between guessing
behavior and motivational and cognitive variables. With regard to cognitive variables
we will test the plausible hypothesis that guessing behavior is applied by students
with lower-level cognitive abilities.

In what follows, we give a short introduction to the MRM. To promote a better
understanding of the MRM concept, some basics of the ordinary dichotomous Rasch
model (RM; Rasch, 1960) and of latent class analysis (LCA; Lazarsfeld, 1950;
Lazarsfeld & Henry, 1968; Rost, 1988) are presented first.



1.1 The Dichotomous Rasch Model (RM) 



Let $p(X_{vi})$ denote the response probability (probability of success) of person v on item i.
The main idea of the dichotomous RM is to decompose $p(X_{vi})$ into a linear combination
of an item parameter (difficulty) and an individual parameter (person's ability). Since
the manifest variable varies only between zero and one, not the probability itself but
the logit of this probability is decomposed:

$$\ln\left(\frac{p(X_{vi})}{1 - p(X_{vi})}\right) = \xi_v + \sigma_i \qquad (1)$$

i.e., the logit of the response probability is equal to the sum of a person's ability
$\xi_v$ and the item difficulty $\sigma_i$.



A simple transformation of Equation (1) yields the better-known response function of the
Rasch model:

$$p(X_{vi}) = \frac{e^{\xi_v + \sigma_i}}{1 + e^{\xi_v + \sigma_i}} \qquad (2)$$


This relationship between latent variables and the response probability is often
represented by the so-called Item Characteristic Curve (ICC), shown in Figure 1.

Although this function is nonlinear, it is evident from Figure 1 that the relationship
between a person's ability and the probability of success is nearly linear in most
segments of the latent continuum. Procedures for parameter estimation and
goodness-of-fit tests are described in the relevant literature (e.g. Hambleton &
Swaminathan, 1989; Wright & Masters, 1982). Fundamental assumptions of the RM
are (1) local independence of the items, (2) homogeneity of persons and items and (3)
specific objectivity. The last of these means that item parameter estimates are independent
of the group of examinees drawn from the population of examinees and that persons'
ability estimates are independent of the particular choice of test items drawn from
the population of items. These strong model assumptions are often violated, making
other models of Item Response Theory (IRT) more attractive.
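
As a minimal sketch of the response function described above, the following Python snippet evaluates the Rasch success probability and its logit. The function names are hypothetical, and the additive form (ability plus item parameter) follows the wording of the text rather than any particular software package.

```python
import math


def rasch_probability(xi_v: float, sigma_i: float) -> float:
    """Success probability of a person with ability xi_v on an item
    with parameter sigma_i, using the additive form of the text."""
    z = xi_v + sigma_i
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    # Trace a coarse item characteristic curve for an item with sigma_i = 0.
    for xi in (-6, -3, 0, 3, 6):
        print(xi, round(rasch_probability(xi, 0.0), 3))
```

Applying `logit` to the returned probability recovers the sum of the two parameters, which is the decomposition the model rests on.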




Figure 1

Item Characteristic Curve in the Rasch model
[Plot not reproduced; horizontal axis: ability $\xi_v$, from -6 to 6.]



1.2 The Latent Class Analysis (LCA) for Dichotomous Data

The LCA is also an IRT model, in which the multivariate relations among observed
categorical variables are explained by the influence of a latent nominal variable. The
response probability of person v on a dichotomous item i is now defined as

$$p(X_{vi}) = \sum_{g=1}^{G} \pi_g \, \pi_{i|g} \qquad (3)$$

with the restriction

$$\sum_{g=1}^{G} \pi_g = 1,$$

where G is the number of subpopulations or latent classes, $\pi_g$ a probability parameter
defining the size of class g, and $\pi_{i|g}$ the probability of success on item i within class g.
In the case of m observed variables, the probability of a person's response pattern
$x_v = (x_{v1}, \ldots, x_{vm})$ can be described as

$$p(x_v) = \sum_{g=1}^{G} \pi_g \prod_{i=1}^{m} \pi_{i|g}^{x_{vi}} \, (1 - \pi_{i|g})^{1 - x_{vi}} \qquad (5)$$

The multiplication of the conditional probabilities $\pi_{i|g}$ within the latent classes follows
from the assumption of local independence of the items. The estimation of the
unknown parameters $\pi_g$ and $\pi_{i|g}$ is usually performed with the EM algorithm,
described in detail by Rost (1988). A model with G classes fits the empirical data
perfectly when all observed variables are independent within each class. To assess the
model fit of a given solution with G latent classes, two goodness-of-fit statistics are
computed: Akaike's Information Criterion (AIC) and the Best Information Criterion (BIC;
Bozdogan, 1987):

AIC = -2 log(L) + 2 k,
BIC = -2 log(L) + log(N) k,

where L is the maximum of the likelihood function, k the number of estimated
independent parameters and N the sample size; the smaller the goodness-of-fit
indices, the better a model with G classes fits. Particularly with larger sample sizes,
the BIC appears to produce more valid results than the AIC.
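
The two indices can be computed directly from a log-likelihood, a parameter count and a sample size. The sketch below is an illustration only: the function names are hypothetical, and the example inputs are the one- and two-class log-likelihoods and parameter counts reported in Table 2 for the simulated sample of 3300 persons.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """AIC = -2 log(L) + 2 k."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood: float, k: int, n: int) -> float:
    """BIC = -2 log(L) + log(N) k, with the natural logarithm."""
    return -2.0 * log_likelihood + math.log(n) * k


if __name__ == "__main__":
    n_persons = 3300  # size of the simulated sample
    # (number of classes, log-likelihood, number of parameters) from Table 2
    solutions = [(1, -48594.98, 47), (2, -48270.50, 93)]
    for g, log_l, k in solutions:
        print(g, round(aic(log_l, k), 2), round(bic(log_l, k, n_persons), 2))
```

With the natural logarithm, the computed BIC values agree with the tabulated ones up to rounding, which suggests that this is the form of the index reported by the program.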

The usual significance tests, i.e. the Pearson $\chi^2$-test for comparing observed and
expected pattern frequencies or the likelihood-ratio test for comparing different
models, are normally not applicable. Both tests have similar asymptotic requirements.
The $\chi^2$-test requires expected frequencies greater than or equal to one for all possible
response patterns, which is usually not fulfilled. With 8 or more dichotomous items,
at least $2^8 = 256$ possible response patterns can occur, and researchers
need very large sample sizes to apply the $\chi^2$-test to the data. Unfortunately, the same
problem arises when applying a likelihood-ratio test to compare two different class
solutions. The likelihood-ratio statistic, derived from the likelihood of a model with G
classes divided by the likelihood of a model with G+1 classes, is only asymptotically
$\chi^2$-distributed if all possible response patterns have a reasonable chance of appearing.
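
The size of this problem is easy to work out. The following sketch (hypothetical function names) counts the possible response patterns for m dichotomous items and the mean expected frequency per pattern, which is N divided by the number of patterns whatever the model, because the expected frequencies sum to N.

```python
def n_patterns(m: int) -> int:
    """Number of distinct response patterns for m dichotomous items."""
    return 2 ** m


def mean_expected_frequency(n: int, m: int) -> float:
    """Average expected frequency per pattern for n examinees and m items."""
    return n / n_patterns(m)


if __name__ == "__main__":
    # 8 items, and the 23-item biology test with 5641 students
    print(n_patterns(8), n_patterns(23))
    print(mean_expected_frequency(5641, 23))
```

When this mean is below one, some patterns necessarily have expected frequencies below one, so the requirement of the test cannot be met for the 23-item test with 5641 students.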

The advantage of the LCA over the RM is that different sets of item parameters are
allowed in different classes. A disadvantage of the LCA is that it does not allow any
variation between the persons' parameters (abilities) within a latent class. This
assumption of constant response probabilities for all individuals in a latent class has
proven too restrictive for many purposes.

1.3 The Mixed Rasch Model (MRM) 

The MRM "combines the theoretical strength of the Rasch model with the heuristic
power of latent class analysis. It assumes that the Rasch model holds for all persons
within a latent class, but it allows for different sets of item parameters between the
latent classes" (Rost, 1990, p. 271). Thus, the MRM is the supermodel of both the LCA
and the RM. The response function of the MRM is described by the following equation:

$$p(X_{vi}) = \sum_{g=1}^{G} \pi_g \, \frac{e^{\xi_{vg} + \sigma_{ig}}}{1 + e^{\xi_{vg} + \sigma_{ig}}} \qquad (6)$$

The response probability $\pi_{i|g}$ from Equation (3) is now rewritten in accordance with
the RM. It is obvious that in the case of only one latent class the MRM is reduced to
the ordinary RM. In the case of different classes but without variation of the ability
parameters within each class, the MRM correspondingly becomes a simple LCA model.

The parameters of the MRM can be estimated by means of an extended EM algorithm
with conditional maximum likelihood estimates of the item parameters in the M-step
(see Rost, 1990, 1994). To assess the model fit of a given solution with G latent
classes, the AIC and BIC indices are again computed.
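
The mixture structure of the MRM response function can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MIRA implementation: the function names are hypothetical, the class-specific person and item parameters are passed in as plain lists, and the additive form of the exponent follows the text.

```python
import math


def rasch_probability(xi: float, sigma: float) -> float:
    """Rasch success probability in the additive form used in the text."""
    return 1.0 / (1.0 + math.exp(-(xi + sigma)))


def mrm_probability(class_sizes, abilities, item_params) -> float:
    """Mixture of class-specific Rasch probabilities for one person and item.

    class_sizes[g] is the class proportion pi_g, abilities[g] the person's
    ability in class g, item_params[g] the item parameter in class g.
    """
    if abs(sum(class_sizes) - 1.0) > 1e-9:
        raise ValueError("class sizes must sum to 1")
    return sum(
        pi_g * rasch_probability(xi_vg, sigma_ig)
        for pi_g, xi_vg, sigma_ig in zip(class_sizes, abilities, item_params)
    )


if __name__ == "__main__":
    # One class: the mixture reduces to the ordinary Rasch model.
    print(mrm_probability([1.0], [0.5], [0.3]), rasch_probability(0.5, 0.3))
    # Two classes: a small class with success probability 0.2 and a
    # larger class with success probability 0.5.
    print(mrm_probability([0.2, 0.8], [math.log(0.25), 0.0], [0.0, 0.0]))
```

The single-class call illustrates the reduction to the ordinary RM mentioned above.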

2. Identification of Guessing Behavior on the Basis of Item
Response Theory (IRT)

Achievement tests typically consist of multiple choice items, each having J choices.
A person's probability of solving such an item by guessing is p=1/J. If any
subsample of persons in a given data set has guessed, the assumption of the
ordinary RM that all item difficulties are constant for all persons is no longer valid.
The item parameters of those guessing examinees will differ from those of the
remaining persons, who use cognitive skills to solve the items. Corresponding to
this deviant behavior of a subsample, a significance test (for instance Andersen's
likelihood-ratio test) will indicate that the RM does not hold for the whole data set. If
the guessing examinees are excluded, the RM will fit the residual sample's data.

A common strategy for indirectly inferring guessing behavior from given response
vectors of N subjects has two steps. In the first step it is assumed that the RM is valid
for all persons, and the model parameters are then estimated. On the basis of these
estimates an ICC is plotted for each item. In the second step the proportion of correct
answers for each score (ability) group is plotted against the ICC of a difficult item.
"Guessing behavior is assumed to be operating when test performance for the low
performing score groups exceeds zero" (Hambleton & Swaminathan, 1989, p. 161).
Figure 2 depicts such a graph, where the proportions of correct answers (indicated by
dots) are plotted against the ICC, which indicates the response probability under the
assumption that the RM is valid.
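
The empirical side of the second step can be sketched as follows. The snippet only computes the proportion of correct answers on one item within each raw-score group; the function name and the toy data are hypothetical, and the ICC to compare against would have to come from a separate Rasch estimation that is not shown here.

```python
from collections import defaultdict


def proportion_correct_by_score(responses, item):
    """Share of examinees in each raw-score group who solved `item`.

    `responses` is a list of 0/1 response patterns, `item` a zero-based
    item index. Returns a dict mapping raw score to proportion correct.
    """
    solved = defaultdict(int)
    count = defaultdict(int)
    for pattern in responses:
        score = sum(pattern)
        count[score] += 1
        solved[score] += pattern[item]
    return {score: solved[score] / count[score] for score in sorted(count)}


if __name__ == "__main__":
    toy_data = [[1, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
    print(proportion_correct_by_score(toy_data, item=2))
```

For a difficult item, proportions in the lowest score groups that stay clearly above the model-implied probability are the pattern that the quoted criterion treats as evidence of guessing.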

As a consequence of these deviations between empirical and expected success
probabilities, educational researchers and psychometricians often apply the
three-parameter logistic model (Birnbaum, 1968), whereby the probability that person v
solves item i is described by:

$$p(X_{vi}) = \gamma_i + (1 - \gamma_i)\,\frac{e^{\beta_i(\xi_v + \sigma_i)}}{1 + e^{\beta_i(\xi_v + \sigma_i)}} \qquad (7)$$

$\beta_i$ is the so-called discrimination parameter of item i; $\gamma_i$, which is more important for
our current purpose, is the so-called guessing parameter.¹ The introduction of a
guessing parameter $\gamma_i$ means that the lower asymptote of the ICC is in
general greater than zero. Thus, with decreasing abilities the ICC in Figure 2 would
converge toward $\gamma_i$ rather than zero. In practice, $\gamma_i$ is normally less than the actual
guessing probability, which in the case of a multiple choice item with J choices is p=1/J. This
phenomenon is explained by Lord (1974), who argued that some examinees with low
abilities do not guess in the case of a difficult item but choose the most attractive
distractor of the item.
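
A short sketch makes the role of the guessing parameter concrete. It assumes the usual Birnbaum form, a lower asymptote plus a rescaled logistic curve, written with the additive sign convention of the Rasch equations above; the function and argument names are hypothetical.

```python
import math


def three_pl_probability(xi_v: float, sigma_i: float,
                         beta_i: float = 1.0, gamma_i: float = 0.0) -> float:
    """Three-parameter logistic success probability.

    beta_i is the discrimination parameter and gamma_i the guessing
    parameter (the lower asymptote of the item characteristic curve).
    """
    logistic = 1.0 / (1.0 + math.exp(-beta_i * (xi_v + sigma_i)))
    return gamma_i + (1.0 - gamma_i) * logistic


if __name__ == "__main__":
    # With gamma_i = 0.2 the curve levels off near 0.2 for very low abilities.
    for xi in (-8, -4, 0, 4, 8):
        print(xi, round(three_pl_probability(xi, 0.0, 1.0, 0.2), 3))
```

Setting `beta_i=1.0` and `gamma_i=0.0` gives back the Rasch response function, which is why the RM can be seen as a special case of this model.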



¹ $\beta_i$ describes variations in the discrimination power of different items. In the ordinary RM all ICCs are
nonintersecting curves that differ only by a translation along the latent continuum. These items vary
only in their difficulty. The two-parameter model additionally allows a variation among the slopes of the
ICCs, that is, a variation among the discrimination powers of different items. The higher $\beta_i$, the better the
discrimination power of item i.

Figure 2

Plot of expected solving probabilities (represented by the ICC) and empirical proportions of
correct answers (indicated by dots) for an item which provoked guessing behavior in low ability
groups
[Plot not reproduced; horizontal axis: ability, from low to high.]



Problems or disadvantages of the three-parameter model, as pointed out by Kubinger
(1988), are:

(1) The assumption of specific objectivity is abandoned, i.e. estimates of
individual parameters are no longer independent of the given subset of items.

(2) The parameters estimated by means of the unconditional maximum likelihood
method are not consistent. Optimal and stable estimates are only possible if
the sample comprises more than 1000 examinees and more than 50 items.

(3) Comparisons between the ordinary RM and the two- or three-parameter model
are difficult. Only in the case of more than 30 items is a likelihood-ratio test
similar to Andersen's test applicable.

In contrast to the two- and three-parameter logistic models, the MRM includes, at least
within the latent classes, all features of the ordinary RM, for example specific
objectivity and stable parameter estimates by means of the conditional maximum
likelihood method, even if the number of items is small. The AIC and BIC indices give
an opportunity to compare different models with varying numbers of classes.

2.1 Guessing Behavior and MRM - a Simulation Study

In the following simulation, we assume a given empirical data set containing the
responses of N examinees on m items with J choices. Some of the examinees (a
subsample of n (n<N) individuals) have applied a guessing strategy to all m items.
Analyzing these data by means of the MRM, we would expect two latent classes, one of
which could be characterized as the "guessing class," with item expectation values
equal to the random probability (p=1/J) and low or, for an infinite number of
persons, no variation between the item parameters. Another expected characteristic of
this class is that, for an infinite number of items, all individuals
should have the same ability, i.e. no variance between the person parameters should
occur. This means that within the "guessing class" the RM is reduced to an LCA model
with only one class. Dealing with finite samples of m items and N persons, however, we
would normally find a small variation of person and item parameters in empirical data,
caused by random differences not directly related to ability.

The item and individual parameters of the second class should vary significantly, and
the expectation values should deviate substantially from the random probability. For
this group we would assume that we have measured the intended trait, e.g. scholastic
achievement. To illustrate these assumptions and expectations we will analyze a
simulated data set below. Let us start with an intelligence test consisting of 24 items
with 5 choices. Here, the random probability of success is p=0.20. Suppose the
existence of two latent classes, one of which contains guessing examinees, the other
subjects who use their cognitive abilities to solve the items. The response probabilities
(item means) of the 24 items in this latter class are assumed to be overall p=0.7 for
the first 8 items, p=0.5 for the second 8 items and p=0.3 for the last 8 items.
According to different "subjects' capabilities" these probabilities of success should
vary between different ability groups, as shown in detail in Table 1.

All members of group 1 have the highest probabilities of success, followed by group 2 
and so on up to group 5, i.e. the simulated abilities decrease from groups 1 to 5. In 
group 6 we assume the guessing strategy and fix all response probabilities at p=0.2, 
which is the random probability of solving a multiple choice item with five choices. 

The corresponding data for each item were first generated for 600 guessing examinees
by means of a small, simple BASIC program. Whether a subject of this class received a
0 (item not solved) or a 1 (item solved) was decided by drawing random numbers from
the standard normal distribution. For each value drawn, the cumulative probability of
the standard normal distribution was calculated from minus infinity to that value. If this
probability was less than or equal to 0.8, the person's response on the item was scored 0;
otherwise it was scored 1.

Table 1

Response probabilities and sample sizes of the 6 simulated groups. Group 6 consists of
"guessing examinees."

Group (v)   p(x1=1|ξv)ᵃ   p(x2=1|ξv)ᵇ   p(x3=1|ξv)ᶜ   Nᵈ
1           0.9           0.7           0.5           540
2           0.8           0.6           0.4           540
3           0.7           0.5           0.3           540
4           0.6           0.4           0.2           540
5           0.5           0.3           0.1           540
6           0.2           0.2           0.2           600

ᵃ Response probability of a subject in group v on an item with an overall
solving probability of p=0.7 in the "non-guessing group"
ᵇ Response probability of a subject in group v on an item with an overall
solving probability of p=0.5 in the "non-guessing group"
ᶜ Response probability of a subject in group v on an item with an overall
solving probability of p=0.3 in the "non-guessing group"
ᵈ Number of persons per group

The remaining simulated persons from group 1 to group 5 were generated with the
same procedure, but, depending on their supposed solving probabilities, the criterion
for whether they received a 0 or a 1 was varied. For instance, members of group 1
(with "high abilities") were scored 0 for the first 8 items if the cumulative probability
between minus infinity and the value drawn was less than or equal to 0.1; otherwise they
received a 1. As another example, members of group 5 (with "low abilities") were scored 0
for the last 8 items if the cumulative probability between minus infinity and the value
drawn was less than or equal to 0.9; otherwise they received a 1.

This procedure was applied to all groups, resulting in a total of 3300 simulated
subjects, 600 of them "guessing examinees". Their responses on the 24 simulated
items were analyzed with the PC program MIRA by Rost and Davier (1992). Table 2
contains goodness-of-fit statistics for different solutions from one up to four classes.
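
The data generation described above can be re-sketched in Python. This is not the original BASIC program, only an illustration of the same scoring rule: draw a standard normal value, compute its cumulative probability, and score 0 when that probability does not exceed one minus the intended success probability. The names `GROUPS`, `simulate_response` and `simulate`, and the seed handling, are assumptions made for the example.

```python
import random
from statistics import NormalDist

# Success probabilities for items 1-8, 9-16 and 17-24, and the group size,
# as given in Table 1. Group 6 is the guessing group.
GROUPS = [
    ((0.9, 0.7, 0.5), 540),
    ((0.8, 0.6, 0.4), 540),
    ((0.7, 0.5, 0.3), 540),
    ((0.6, 0.4, 0.2), 540),
    ((0.5, 0.3, 0.1), 540),
    ((0.2, 0.2, 0.2), 600),
]

STANDARD_NORMAL = NormalDist()


def simulate_response(p_success: float, rng: random.Random) -> int:
    """Score one item: 0 if the cumulative probability of the drawn
    standard normal value is at most 1 - p_success, otherwise 1."""
    z = rng.gauss(0.0, 1.0)
    return 0 if STANDARD_NORMAL.cdf(z) <= 1.0 - p_success else 1


def simulate(seed: int = 0):
    """Return a list of (group number, 24-item response pattern) pairs."""
    rng = random.Random(seed)
    data = []
    for group, (probs, size) in enumerate(GROUPS, start=1):
        for _ in range(size):
            pattern = [simulate_response(probs[i // 8], rng) for i in range(24)]
            data.append((group, pattern))
    return data


if __name__ == "__main__":
    data = simulate(seed=1)
    guessers = [x for group, pattern in data if group == 6 for x in pattern]
    print(len(data), sum(guessers) / len(guessers))
```

Because the cumulative probability of a standard normal draw is uniformly distributed between 0 and 1, this rule yields a 1 with exactly the intended success probability.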

Table 2

Goodness-of-fit statistics for different solutions of the MIRA analysis

Gᵃ   log Lᵇ       kᶜ    BICᵈ
1    -48594.98    47    97570.70
2    -48270.50    93    97294.38
3    -48221.83    139   97569.68
4    -48185.32    185   97869.29

ᵃ number of latent classes; ᵇ log-likelihood; ᶜ number of estimated
independent parameters; ᵈ Best Information Criterion (Bozdogan,
1987)



According to the BIC index, the two-class solution displays the best fit. The first class
consists of 627 members, the second of 2673. Each examinee was assigned to a class
on the basis of his or her response pattern: any subject was assigned to the latent
class for which, given his or her response vector, the membership probability was
highest. In total, 6.4% of all persons were misclassified, that is, were assigned to the
guessing class although they had been simulated as non-guessers, or were assigned
to the non-guessing class despite being simulated as guessers.²
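
The assignment rule, choosing the class with the highest membership probability given the response pattern, can be illustrated with a simplified sketch. As a loudly stated assumption, the snippet uses fixed within-class item success probabilities (an LCA-type simplification) instead of the class-specific Rasch likelihoods that MIRA works with; all names and the example class sizes are hypothetical.

```python
def pattern_likelihood(pattern, item_probs) -> float:
    """Probability of a 0/1 pattern under local independence."""
    likelihood = 1.0
    for x, p in zip(pattern, item_probs):
        likelihood *= p if x == 1 else 1.0 - p
    return likelihood


def membership_probabilities(pattern, class_sizes, class_item_probs):
    """Posterior class membership probabilities given a response pattern."""
    joint = [
        size * pattern_likelihood(pattern, probs)
        for size, probs in zip(class_sizes, class_item_probs)
    ]
    total = sum(joint)
    return [value / total for value in joint]


def assign_class(pattern, class_sizes, class_item_probs) -> int:
    """Zero-based index of the class with the highest membership probability."""
    posterior = membership_probabilities(pattern, class_sizes, class_item_probs)
    return max(range(len(posterior)), key=posterior.__getitem__)


if __name__ == "__main__":
    sizes = [0.2, 0.8]  # illustrative class proportions
    guessing = [0.2] * 24
    non_guessing = [0.7] * 8 + [0.5] * 8 + [0.3] * 8
    print(assign_class([0] * 24, sizes, [guessing, non_guessing]))
    print(assign_class([1] * 16 + [0] * 8, sizes, [guessing, non_guessing]))
```

Patterns that are about equally likely under both classes are the ones at risk of misclassification, which is consistent with the chance explanation given for the misclassified cases.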

The interpretation of the two latent classes can be drawn from the graphical
representation of the item expectation values in Figure 3. Congruent with the
simulation assumptions, class 1 is characterized by item expectation values near the
random probability of p=0.20. The corresponding item parameter estimates vary
between $\sigma_{ig}$=-0.22 and $\sigma_{ig}$=+0.26;³ the variance of the individual parameters amounts
to V($\xi$)=0.36.

The second latent class also shows item expectation values in accordance with the
simulated assumptions. The expectation values for the first 8 items are near p=0.70,
for the second 8 items near p=0.50 and for the last 8 items near p=0.30. The
corresponding item parameters vary between $\sigma_{ig}$=-1.02 and $\sigma_{ig}$=+1.02. The variance of
the person parameters is V($\xi$)=0.64, which is significantly higher (F(2699,599)=1.78,
p<0.001) than the variance in the guessing class.

² These misclassifications are caused by random processes: a guessing person, for instance, can by chance
produce a response vector which will cause him or her to be assigned to the class of non-guessing examinees.
³ In MIRA the sum of all item parameters is standardized to 0. In the case of an infinite number of
examinees all item parameters should be equal to 0 because of the above norming condition.



Figure 3

Expectation values of the 24 simulated items within the two latent classes
[Plot not reproduced; vertical axis: expectation value (0 to 1); horizontal axis: item (1 to 24); legend: class 1, class 2.]



In summary, we have illustrated by means of a simulation study that the MRM is a 
potential tool to identify examinees who apply a guessing strategy to a set of multiple 
choice items. 



3. Identification of Guessing Behavior - an Empirical Study 

In our study 5641 7th grade students were tested with different scholastic
achievement and intelligence tests, all of which were speed tests.⁴ As an example we
analyzed the results of a biology test consisting of 23 multiple choice items, each
having 5 choices. Right answers were scored 1, wrong answers 0. The
investigated sample consists of N=2889 students (50.7% females) from a state of the
former Federal Republic of Germany (FRG), and N=2752 students (53.4% females)
from two states of the former German Democratic Republic (GDR). In accordance with
our assumptions, we expected at least two latent classes:

⁴ This current investigation is part of a longitudinal study called "Educational Processes and Psycho-Social
Development in Adolescence (BIJU)" which began in 1991. The following institutions are involved: Institute
for Science Education (IPN), Kiel; Max Planck Institute for Human Development (MPI), Berlin; Humboldt
University, Berlin; and Martin Luther University, Halle. The project leaders are Prof. Dr. J. Baumert (IPN)
and Prof. Dr. P.M. Roeder (MPI).

Class 1: In this class, the 23 items should test the factor "biology knowledge". The
item expectation values should deviate significantly from the random probability, and
the amounts of variance between item parameters and between individual parameters
should be significantly higher than those in class 2.

Class 2: In this class we expected to find all those examinees who applied a guessing
strategy to solve the tasks. Most of their item expectation values should be equal to
the random probability, p=0.20 when five choices are present.⁵ The amount of
variance of the item and individual parameters should be very small.

Table 3 contains goodness-of-fit statistics for different MIRA solutions from one to
four classes.

Table 3

Goodness-of-fit statistics for different solutions of the MIRA analysis for the biology test

Gᵃ   log Lᵇ       kᶜ    BICᵈ
1    -75545.51    45    151479.64
2    -75100.37    89    150969.36
3    -74799.16    133   150747.02
4    -74648.18    177   150817.47

ᵃ number of latent classes; ᵇ log-likelihood; ᶜ number of estimated
independent parameters; ᵈ Best Information Criterion (Bozdogan,
1987)

According to the BIC index, the three-class solution fits the data better than all other
solutions. Figure 4 shows the item expectation values for the three latent classes.

Class 1 (26.1% of the whole sample) is characterized by the fact that most of the item
expectation values are similar to the random probability p=0.20. Only very easy items
(items 4, 6, 7, 8, 9 and 18), with expectation values greater than p=0.70 in the other
classes, have values substantially higher than 0.20. This first class might have used a
guessing strategy to solve the items. Only easy tasks provoked a temporary change in
this strategy. The item parameters vary between $\sigma_{ig}$=-0.86 and $\sigma_{ig}$=+1.94; the variance
of the individual parameters is V($\xi$)=0.48. The range of the item parameters is greater
than that of the simulated data, which can be explained by the fact that particularly
easy items in the empirical data did not provoke a guessing strategy. The mean number of
solved items is M=7.75.

⁵ Only very easy items with obvious solutions should form an exception. For these items we assume
expectation values greater than the random probability.



Figure 4

Item expectation values for the three-class solution from MIRA.
[Plot not reproduced; vertical axis: expectation value; horizontal axis: item (1 to 23); legend: class 1 (22.9%), class 2 (8.8%), class 3 (68.2%).]



Class 2 (64.9% of the whole sample) shows expectation values that all deviate from the
random probability (p=0.20). These results support the hypothesis that, in this latent
class, we have actually tested the factor "biology knowledge" and not guessing
behavior. The item parameters vary between $\sigma_{ig}$=AA5 and $\sigma_{ig}$=+3.04; the variance of
the individual parameters is V($\xi$)=0.&3. The mean number of solved items is M=14.14 and
therefore significantly higher than in class 1 (t=70.70, df=5126, p=0.000).

Class 3 (8.9% of the whole sample) differs from class 2 both quantitatively and
qualitatively. "Quantitative" means that these examinees show lower item expectation
values for the first 14 items than those in class 2. This can be explained by a lower
level of knowledge in class 3. "Qualitative" means that, beginning with item 15, a
second trait, namely processing speed, has influenced the item response
probabilities. The test was administered as a speed test, and evidently the persons in
class 3 were not able to solve the last items because of their slow processing speed.
The item parameters in this class vary between $\sigma_{ig}$=-6.91 and $\sigma_{ig}$=+4.29, and the
variance of the individual parameters amounts to V($\xi$)=0.62. The mean number of solved
items is M=8.36, which corresponds approximately to the mean in class 1 and is
significantly lower than in class 2 (t=40.32, df=4442, p=0.000).

In summary, the results of the MIRA analysis support the hypothesis that a latent
guessing class exists. This assumption is confirmed (1) by the profile of the item
expectation values within this class and (2) by the reduced variance within it. Pairwise
F-tests revealed that the variance within the guessing class was significantly smaller
than within the other classes (p<0.001).

3.1 The Relationship between "Guessing Behavior" and Cognitive 
and Motivational Variables 

The goal of the following step was to analyze the relationship between guessing
behavior and cognitive, motivational and self-related variables. We did not analyze
differences in cognitive variables among the three latent classes by means of an
intelligence test, because several MIRA analyses of subscales from the Intelligence
Structure Test (in German: Intelligenzstrukturtest IST; Amthauer, 1953) and from the
Cognitive Ability Test (in German: Kognitiver Faehigkeitstest KFT 4-13; Heller, Gaedike
& Weinlaeder, 1976) revealed that, again, different strategies were used to solve the
items. Instead of an intelligence test, we analyzed differences in scholastic
achievement measured by grades.

Table 4 contains the results of our mean comparisons among the three latent classes
with respect to the biology grade and the sum of grades in mathematics and German.
These variables were both standardized beforehand, so that values below zero indicate
a low level of scholastic achievement (below the mean), and values above zero stand
for a high level of achievement (above the mean). As expected, the guessing class
shows the lowest achievement level both in biology and in the combination of
mathematics and German. The highest achievement is shown by class 2, in which we
measured the desired variable "biology knowledge", followed by class 3, which was
characterized by a slow processing speed.



Table 4 

Class means of different achievement variables and motivational factors 





class 1 


class 2 


class 3 


Math and Gennan grades 


-0.66 


0.20 


-0.32 


Biology grade 


-0.61 


0.19 


-0.34 


Biology -specific anxiety 


0.37 


-0.12 


0.06 


Self-concept of ability in biology 


-0.39 


0.13 


-0.10 



One-way analyses of variance revealed that all variables differ significantly among the 
three groups (p<0.01). Using Tukey's HSD test, we found that all pairwise 
comparisons were significant (p<0.01) for all variables. 
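
As an illustration of the one-way analysis of variance mentioned above, the following minimal Python sketch computes the F statistic for three groups. The scores are made up for illustration and are not the study data, and the function name is hypothetical. Obtaining a p-value or running Tukey's HSD test would additionally require the F and studentized range distributions from a statistics library, which is not shown here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Made-up standardized scores for three hypothetical classes (not study data).
class_1 = [-1.0, -0.5, -0.75]
class_2 = [0.25, 0.5, 0.0]
class_3 = [-0.25, -0.5, 0.0]

f_value, df_b, df_w = one_way_anova_f([class_1, class_2, class_3])
print(f_value, df_b, df_w)  # 12.0 2 6
```

A large F value means that the class means differ by more than the spread within the classes would lead one to expect.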



Table 4 also provides information regarding the standardized means of the three 
latent classes with respect to biology-specific anxiety and the self-concept of 
ability in biology. Anxiety was measured by means of a test by Helmke (1992) 
containing 23 items with 5-point ratings for each item. The domain-specific 
self-concept of ability was tested by means of a short scale by Jerusalem (1984) 
consisting of 5 items with 4-point ratings for each item. The reliability of both 
scales was satisfactory (Cronbach's α = .93 for the anxiety scale and α = .87 for the 
self-concept scale). Table 5 presents some item examples. These two affective 
variables were chosen to determine whether the guessing students show 
characteristics which, in addition to a lower cognitive level, inhibit learning and 
performance in school. 
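
For readers unfamiliar with the reliability coefficient reported above, the following minimal Python sketch computes Cronbach's alpha from a respondents-by-items matrix of ratings. The ratings are made up for illustration and are not the study data, and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item ratings (one row per respondent)."""
    k = len(rows[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up ratings: 4 respondents, 3 items on a 4-point scale (not study data).
ratings = [
    [4, 3, 4],
    [2, 2, 1],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(ratings)
print(round(alpha, 3))  # 0.889
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why it is commonly read as an index of internal consistency.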

With respect to the self-concept of ability, a meta-analysis by Hansford and Hattie 
(1982) shows a correlation coefficient of r=.42 between self-concept and scholastic 
achievement, which indicates a strong relationship between these two variables. An 
appropriate interpretation of this relationship is that a higher self-concept 
corresponds to a higher level of aspiration, which stimulates persistence in the 
students' learning processes and leads to greater knowledge and better performance on 
achievement tests. 
With respect to the relationship between anxiety and achievement, the conclusions of 
the relevant literature can be interpreted as indicating that a high level of anxiety 
has no strong positive or negative influence on the learning process itself, but does 
affect the concrete performance situation in which students have to solve tasks 
(Schnabel & Gruehn, 1993). 

Based on this theoretical background, the results summarized in Table 4 are in 
accordance with our expectations. Compared with the two other latent classes, the 
guessing examinees expressed a higher level of anxiety and a lower self-concept of 
ability. It is plausible that the higher level of anxiety in the performance situation 
(solving the 23 items of the biology test), combined with lower cognitive capabilities, 
provoked the guessing behavior. The random strategy might have been chosen to cope 
with items that were too difficult. 

Table 5 

Item examples for the scales self-concept of ability in biology (Jerusalem, 1984) and 
biology-specific anxiety (Helmke, 1992) 



Self-concept of ability in biology 

1. I would like biology if this subject were not so difficult. 

2. Even if I do my best in biology, I do not perform as well as the other students 
in my class. 

(ratings from 1="do not agree" up to 4="strongly agree") 

Biology-specific anxiety 
Please remember the last class test in biology. How did you feel? 

1. I doubted my abilities. 

2. I imagined who among the other students would perform worse than I. 
(ratings from 1="do not agree" up to 4="strongly agree") 



4. Summary and Discussion 

Psychological hypotheses addressing different cognitive strategies often seem to be 
incompatible with psychometric models concerning test behavior. A common test 
model like the ordinary Rasch model, which assumes that a set of items measures the 
same trait, is not applicable to the case where different cognitive strategies are used to 
solve the items. In the current study we introduced the MRM as a statistical 
possibility for detecting such different processing strategies in achievement tests. 
Applying the MRM to simulated and empirical data, we tested and confirmed the 
hypothesis that at least two strategies can be applied to scholastic achievement tests 
consisting of multiple-choice items: a guessing strategy and a strategy based on 
knowledge. Further analyses provided information that guessing behavior is shown by 
students with lower-level cognitive abilities and a higher level of anxiety, who perhaps 
use the "random strategy" to cope with the items that are too difficult for them. 

In addition to this class of examinees who guess, we identified two other classes. 
One could be described simply as those persons whose responses were based 
on knowledge; the other could be characterized by the fact that its members' item 
response probabilities were influenced by two dimensions, namely biology knowledge 
and processing speed. This result further demonstrates that the MRM can also 
provide researchers with information about different processing strategies in different 
groups. 

As a consequence of our approach, the interpretation of students' scores on 
achievement tests should include two steps. In the first step researchers have to 
identify the applied strategy of a given examinee and in the second step calculate the 
individual (ability) parameter for the identified latent variable. The MRM allows such a 
qualitative and quantitative analysis of given response vectors. 
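
The two steps can be illustrated with a deliberately simplified Python sketch. It is not the estimation procedure of the MRM itself: it assumes only two classes (guessing and knowledge), treats the item difficulties and the class size as known, and integrates ability over a standard normal grid instead of conditioning on the raw score. All numerical values and function names are hypothetical; only the chance level of 1/5 follows from the five response options per item.

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response for ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pattern_likelihood(responses, probs):
    """Probability of a 0/1 response pattern given per-item success probabilities."""
    like = 1.0
    for x, p in zip(responses, probs):
        like *= p if x == 1 else 1.0 - p
    return like

def normal_grid(lo=-4.0, hi=4.0, n=41):
    """Discrete standard normal weights over ability (an assumption of this sketch)."""
    step = (hi - lo) / (n - 1)
    points = [lo + i * step for i in range(n)]
    dens = [math.exp(-0.5 * t * t) for t in points]
    total = sum(dens)
    return [(t, d / total) for t, d in zip(points, dens)]

def posterior_guessing(responses, difficulties, pi_guess, n_options=5):
    """Step 1: posterior probability that a pattern stems from the guessing class."""
    guess_like = pattern_likelihood(responses, [1.0 / n_options] * len(responses))
    know_like = sum(
        w * pattern_likelihood(responses, [rasch_p(t, b) for b in difficulties])
        for t, w in normal_grid()
    )
    numerator = pi_guess * guess_like
    return numerator / (numerator + (1.0 - pi_guess) * know_like)

def ml_ability(raw_score, difficulties, iterations=50):
    """Step 2: maximum likelihood ability via Newton-Raphson (0 < score < n items)."""
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_p(theta, b) for b in difficulties]
        info = sum(p * (1.0 - p) for p in probs)
        theta += (raw_score - sum(probs)) / info
    return theta

# Hypothetical values for illustration only (not estimates from the study).
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
pi_guess = 0.3

consistent = [1, 1, 1, 0, 0]    # easy items solved, hard items missed
inconsistent = [0, 0, 0, 0, 1]  # only the hardest item solved

p_consistent = posterior_guessing(consistent, difficulties, pi_guess)
p_inconsistent = posterior_guessing(inconsistent, difficulties, pi_guess)
theta_hat = ml_ability(sum(consistent), difficulties)
```

In this sketch a pattern that contradicts the ordering of item difficulties receives a higher posterior probability of guessing, and an ability estimate is computed only for a pattern assigned to the knowledge class.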